[Serve] Optimize pack scheduling from O(replicas × total_replicas) to O(replicas × nodes) #60806

Open

abrarsheikh wants to merge 2 commits into master from 60680-abrar-schedule

Conversation

@abrarsheikh (Contributor) commented on Feb 6, 2026

  • _schedule_with_pack_strategy was calling _get_available_resources_per_node() and _get_node_to_running_replicas() per replica being scheduled. Each call iterates over all launching + running replicas across all deployments. With 2048 replicas, this produced O(replicas²) work.
  • Compute both once before the loop and update available_resources_per_node incrementally by subtracting the scheduled replica's resources from the target node after each placement (a runnable sketch of this pattern follows the list).
  • The incremental update is slightly conservative (subtracts from the min(GCS, calculated) result rather than only the calculated side), which is consistent with the existing best-effort semantics of _get_available_resources_per_node.
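
For illustration, a minimal self-contained sketch of that pattern, with hypothetical names and plain dicts standing in for the actual ray.serve internals: the cluster resource view is built once, and each placement mutates it in place, so the loop does O(nodes) work per replica instead of rescanning every launching and running replica.

from typing import Dict, List, Optional


def pick_most_packed_feasible_node(
    available: Dict[str, Dict[str, float]], required: Dict[str, float]
) -> Optional[str]:
    # Stand-in pack heuristic: among nodes that can fit the replica, prefer
    # the one with the least free CPU. The real heuristic differs; this only
    # keeps the sketch runnable.
    feasible = [
        node
        for node, res in available.items()
        if all(res.get(k, 0.0) >= v for k, v in required.items())
    ]
    return min(feasible, key=lambda n: available[n].get("CPU", 0.0), default=None)


def schedule_with_pack_strategy(
    requests: List[Dict[str, float]],
    available: Dict[str, Dict[str, float]],
) -> Dict[int, Optional[str]]:
    # `available` is computed ONCE by the caller (the O(total_replicas)
    # cluster scan) instead of being recomputed for every request.
    placements: Dict[int, Optional[str]] = {}
    for i, required in enumerate(requests):
        target = pick_most_packed_feasible_node(available, required)  # O(nodes)
        placements[i] = target
        if target is not None:
            # Incremental update: subtract the placed replica's resources
            # from the target node rather than rebuilding the whole view.
            for resource, amount in required.items():
                available[target][resource] -= amount
    return placements


if __name__ == "__main__":
    nodes = {f"node{i}": {"CPU": 4.0} for i in range(3)}
    print(schedule_with_pack_strategy([{"CPU": 1.0}] * 10, nodes))
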
Benchmark with mocked objects
import os

# Must be set before the ray.serve imports below so the pack strategy is used.
os.environ["RAY_SERVE_USE_PACK_SCHEDULING_STRATEGY"] = "1"

import time

from ray._raylet import NodeID
from ray.serve._private import default_impl
from ray.serve._private.common import DeploymentID, ReplicaID
from ray.serve._private.config import ReplicaConfig
from ray.serve._private.deployment_scheduler import (
    ReplicaSchedulingRequest,
    SpreadDeploymentSchedulingPolicy,
)
from ray.serve._private.test_utils import MockActorClass, MockClusterNodeInfoCache


def dummy():
    # Placeholder deployment body; replicas are mocked and never started.
    pass


def bench(num_replicas: int, num_nodes: int, cpus_per_node: int):
    d_id = DeploymentID(name="deployment1")

    # Build a fake homogeneous cluster: num_nodes nodes, cpus_per_node CPUs each.
    cache = MockClusterNodeInfoCache()
    for _ in range(num_nodes):
        node_id = NodeID.from_random().hex()
        cache.add_node(node_id, {"CPU": cpus_per_node})

    scheduler = default_impl.create_deployment_scheduler(
        cache,
        head_node_id_override="fake-head-node-id",
        create_placement_group_fn_override=None,
    )
    scheduler.on_deployment_created(d_id, SpreadDeploymentSchedulingPolicy())
    scheduler.on_deployment_deployed(
        d_id, ReplicaConfig.create(dummy, ray_actor_options={"num_cpus": 1})
    )

    # One 1-CPU scheduling request per replica.
    requests = [
        ReplicaSchedulingRequest(
            replica_id=ReplicaID(unique_id=f"r{i}", deployment_id=d_id),
            actor_def=MockActorClass(),
            actor_resources={"CPU": 1},
            actor_options={},
            actor_init_args=(),
            on_scheduled=lambda *a, **kw: None,
        )
        for i in range(num_replicas)
    ]

    # Time a single schedule() call that places every replica at once.
    start = time.perf_counter()
    scheduler.schedule(upscales={d_id: requests}, downscales={})
    elapsed = time.perf_counter() - start
    return elapsed


if __name__ == "__main__":
    configs = [
        # (replicas, nodes, cpus_per_node)
        (256, 8, 64),
        (512, 16, 64),
        (1024, 32, 64),
        (2048, 64, 64),
        (4096, 64, 128),
        (8192, 128, 128),
        (16384, 256, 128),
    ]

    print(f"{'replicas':>10} {'nodes':>6} {'cpus/node':>10} {'time (s)':>10}")
    print("-" * 42)
    for num_replicas, num_nodes, cpus in configs:
        elapsed = bench(num_replicas, num_nodes, cpus)
        print(f"{num_replicas:>10} {num_nodes:>6} {cpus:>10} {elapsed:>10.3f}")
[image: benchmark output showing elapsed scheduling time for each configuration]

Related to #60680.

@abrarsheikh requested a review from a team as a code owner on February 6, 2026 09:11
@abrarsheikh changed the title from "[Serve] ptimize pack scheduling from O(replicas × total_replicas) to O(replicas × nodes)" to "[Serve] Optimize pack scheduling from O(replicas × total_replicas) to O(replicas × nodes)" on Feb 6, 2026
@abrarsheikh added the go label (add ONLY when ready to merge, run all tests) on Feb 6, 2026

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a significant optimization to the pack scheduling strategy, reducing its complexity from O(replicas × total_replicas) to O(replicas × nodes). This is achieved by computing available resources and running replica mappings once before scheduling, and then incrementally updating the available resources. The changes are well-implemented and improve performance for large-scale deployments.

I have one suggestion to simplify a conditional check. Also, there's a small typo in the pull request title ('ptimize' should be 'optimize').

Great work on this optimization!

@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

